Recognition of Linear Context-Free Rewriting Systems

Author

  • Giorgio Satta
Abstract

The class of linear context-free rewriting systems has been introduced as a generalization of a class of grammar formalisms known as mildly context-sensitive. The recognition problem for linear context-free rewriting languages is studied at length here, and evidence is presented that, even in some restricted cases, it cannot be solved efficiently. This entails the existence of a gap between, for example, tree adjoining languages and the subclass of linear context-free rewriting languages that generalizes the former class; this gap is attributed to "crossing configurations". A few other interesting consequences of the main result, concerning the recognition problem for linear context-free rewriting languages, are also discussed.

* I am indebted to Anuj Dawar, Shyam Kapur and Owen Rambow for technical discussion on this work. I am also grateful to Aravind Joshi for his support in this research. None of these people is responsible for any error in this work. This research was partially funded by the following grants: ARO grant DAAL 03-89-C-0031, DARPA grant N00014-90J-1863, NSF grant IRI 90-16592 and Ben Franklin grant 91S.3078C-1.

1 INTRODUCTION

Beginning in the late 1970s, there has been considerable interest within computational linguistics in rewriting systems that enlarge the generative power of context-free grammars (CFG), from both the weak and the strong perspective, while still remaining far below the power of the class of context-sensitive grammars (CSG). The name mildly context-sensitive (MCS) has been proposed for the class of the studied systems (see [Joshi et al., 1991] for discussion). The rather surprising fact that many of these systems have been shown to be weakly equivalent has led researchers to generalize the elementary operations involved in formalisms that are only apparently different, with the aim of capturing the underlying similarities.
The most remarkable attempts in this direction are found in [Vijay-Shanker et al., 1987] and [Weir, 1988], with the introduction of linear context-free rewriting systems (LCFRS), and in [Kasami et al., 1987] and [Seki et al., 1989], with the definition of multiple context-free grammars (MCFG); both classes were inspired by the much more powerful class of generalized context-free grammars (GCFG; see [Pollard, 1984]). In the definition of these classes, the generalization goal has been combined with a few theoretically motivated constraints, among them the requirement of efficient parsability; this paper is concerned with that requirement. We show that, from the perspective of efficient parsability, a gap is still found between MCS and some subclasses of LCFRS. More precisely, the class of LCFRS is studied along two dimensions, to be precisely defined in the following: a) the fan-out of the grammar and b) the production length. From previous work (see [Vijay-Shanker et al., 1987]) we know that the recognition problem for LCFRS is in P when both dimensions are bounded.¹ We complete the picture by observing NP-hardness for all three remaining cases. If P ≠ NP, our result reveals an undesired dissimilarity between well-known formalisms like TAG, HG, LIG and others, for which the recognition problem is known to be in P (see [Vijay-Shanker, 1987] and [Vijay-Shanker and Weir, 1992]), and the subclass of LCFRS that is intended to generalize these formalisms. We investigate the source of the suspected additional complexity and derive some other practical consequences from the obtained results.

¹ P is the class of all languages decidable in deterministic polynomial time; NP is the class of all languages decidable in nondeterministic polynomial time.

2 TECHNICAL RESULTS

This section presents the two technical results that are the most important in this paper.
A full discussion of some interesting implications for recognition and parsing is deferred to Section 3. Due to the scope of the paper, the proofs of Theorems 1 and 2 below are not carried out in full detail: we only present formal specifications for the studied reductions and discuss the intuitive ideas behind them.

2.1 PRELIMINARIES

Different formalisms in which rewriting is applied independently of the context have been proposed in computational linguistics for the treatment of natural language, where the definition of the elementary rewriting operation varies from system to system. The class of linear context-free rewriting systems (LCFRS) was defined in [Vijay-Shanker et al., 1987] with the intention of capturing, through a generalization, the common properties shared by all these formalisms. The basic idea underlying the definition of LCFRS is to impose two major restrictions on rewriting. First, rewriting operations are applied in the derivation of a string in a way that is independent of the context. Second, rewriting operations are generalized by means of abstract composition operations that are linear and nonerasing. In an LCFRS, both restrictions are realized by defining an underlying context-free grammar in which each production is associated with a function that encodes a composition operation having the above properties. The following definition is essentially the same as the one proposed in [Vijay-Shanker et al.,
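To make the idea of productions paired with linear, nonerasing composition functions concrete, here is a minimal sketch in Python. The toy grammar, the function names (`base`, `wrap`, `concat`, `derive`) and the target language a^n b^n c^n d^n are illustrative assumptions, not taken from the paper. The nonterminal A derives pairs of strings (fan-out 2), and each function uses every component of its argument exactly once, so nothing is copied or dropped.

```python
from typing import Tuple

StrTuple = Tuple[str, ...]

# Hypothetical composition functions for a toy grammar (an assumption for
# illustration). Underlying context-free productions:
#   A -> base()      A -> wrap(A)      S -> concat(A)

def base() -> StrTuple:
    # A derives the pair ("ab", "cd"); A has fan-out 2.
    return ("ab", "cd")

def wrap(x: StrTuple) -> StrTuple:
    # Linear and nonerasing: x1 and x2 each appear exactly once in the result.
    x1, x2 = x
    return ("a" + x1 + "b", "c" + x2 + "d")

def concat(x: StrTuple) -> StrTuple:
    # S has fan-out 1: the two components are joined into a single string.
    x1, x2 = x
    return (x1 + x2,)

def derive(n: int) -> str:
    # Evaluate the derivation S -> concat(wrap(...wrap(base())...))
    # with n - 1 applications of wrap, for n >= 1.
    t = base()
    for _ in range(n - 1):
        t = wrap(t)
    return concat(t)[0]

if __name__ == "__main__":
    for n in range(1, 4):
        print(n, derive(n))
```

The derivation structure is that of an ordinary context-free grammar; only the way the yields of the children are combined goes beyond plain concatenation, which is how this toy grammar reaches a language that is not context-free.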


Similar resources

Parsing Linear Context-Free Rewriting Systems with Fast Matrix Multiplication

We describe a recognition algorithm for a subset of binary linear context-free rewriting systems (LCFRS) with running time O(n^(ωd)), where M(m) = O(m^ω) is the running time for m×m matrix multiplication and d is the “contact rank” of the LCFRS: the maximal number of combination and non-combination points that appear in the grammar rules. We also show that this algorithm can be used as a subroutine t...

Efficient Parsing of Well-Nested Linear Context-Free Rewriting Systems

The use of well-nested linear context-free rewriting systems has been empirically motivated for modeling of the syntax of languages with discontinuous constituents or relatively free word order. We present a chart-based parsing algorithm that asymptotically improves the known running time upper bound for this class of rewriting systems. Our result is obtained through a linear space construction...

Prefix Probabilities for Linear Context-Free Rewriting Systems

We present a novel method for the computation of prefix probabilities for linear context-free rewriting systems. Our approach streamlines previous procedures to compute prefix probabilities for context-free grammars, synchronous context-free grammars and tree adjoining grammars. In addition, the methodology is general enough to be used for a wider range of problems involving, for example, sever...

On the Expressive Power of Abstract Categorial Grammars: Representing Context-Free Formalisms

We show how to encode context-free string grammars, linear context-free tree grammars, and linear context-free rewriting systems as Abstract Categorial Grammars. These three encodings share the same constructs, the only difference being the interpretation of the composition of the production rules. It is interpreted as a first-order operation in the case of context-free string grammars, as a sec...

Linear Context-Free Rewriting Systems and Deterministic Tree-Walking Transducers

Introduction. In [9] a comparison was made of the generative capacity of a number of grammar formalisms. Several were found to share a number of characteristics (described below) and the class of such formalisms was called linear context-free rewriting systems. This paper shows how the class of string languages generated by linear context-free rewriting systems relates to a number of ...


Publication date: 1992